Multiple mitogenomes indicate Things Fall Apart with Out of Africa or Asia hypotheses for the phylogeographic evolution of Honey Bees (Apis mellifera)

Previous morpho-molecular studies of evolutionary relationships within the economically important genus of honey bees (Apis), including the Western Honey Bee (A. mellifera L.), have suggested Out of Africa or Asia origins and subsequent spread to Europe. I test these hypotheses by a meta-analysis of complete mitochondrial DNA coding regions (11.0 kbp) from 22 nominal subspecies represented by 78 individual sequences in A. mellifera. Parsimony, distance, and likelihood analyses identify six nested clades: Things Fall Apart with Out of Africa or Asia hypotheses. Molecular clock-calibrated phylogeographic analysis shows instead a basal origin of A. m. mellifera in Europe ~ 780 Kya, and expansion to Southeast Europe and Asia Minor ~ 720 Kya. Eurasian bees spread southward via a Levantine/Nilotic/Arabian corridor into Africa ~ 540 Kya. An African clade re-established in Iberia ~ 100 Kya spread thereafter to westerly Mediterranean islands and back into North Africa. Nominal subspecies within the Asia Minor and Mediterranean clades are less differentiated than are individuals within other subspecies. Names matter: paraphyletic anomalies are artefacts of mis-referral in GenBank of sequences to the wrong subspecies, or use of faulty sequences, which are clarified by inclusion of multiple sequences from available subspecies.

www.nature.com/scientificreports/ lineages. Molecular studies based on analyses of various components of the nuclear genome (see Discussion) agree broadly on rearrangement of these as an MAOC backbone, rooted so as to offer alternative theories of origin, either "Thrice Out of Africa" 9 [root within A] or "Thrice Out of Asia" 10 [root between A & O]. These studies are based on fewer than a dozen of the available subspecies of A. mellifera. Other phylogeographic conclusions have been reached from analyses of complete mitochondrial DNA (mtDNA) genomes from single sequences per subspecies 11,12 . In particular, Tihelka and co-workers in this journal 12 have recently provided a meta-analysis of mtDNA, based on a substantial body of data from Boardman et al. 11 and new methods of phylogenetic inference. They reached yet another alternative hypothesis, a Middle Eastern/North African origin of A. mellifera. Tihelka and co-workers emphasize the need for a reliable backbone phylogeny for A. mellifera L. to understand the evolution of the subspecies, including its geographic origin and the development of adaptative differences among subspecies that contribute to their ecological and commercial success 6 . My editorial review of the use of an mtDNA spacer region between the CO1 and CO2 genes as a tool for discrimination of individual bees within and among subspecies of A. mellifera [cf. Ref. 13 ] found no comprehensive phylogenetic evaluation that included multiple mtDNA genome variants within multiple subspecies 11,12,[14][15][16][17][18][19][20][21][22][23][24][25][26] . Unexpectedly, my preliminary assembly of multiple sequences assigned to A. m. mellifera in GenBank, taken at face value, seemed to indicate extensive paraand (or) polyphyly and (or) extraordinary genetic diversity within this subspecies and with respect to other taxa. Table 1. Apis taxonomy, GenBank accessions, & distribution of species and subspecies for which complete mitogenome sequences are available 2 . The subspecies epithet iberica is a junior synonym for the Caucasian form, and as such is pre-empted for the Iberian Honey Bee. The correct trinomial is A. m. iberiensis Engel, 1999. The subspecies epithet for the Caucasian Honey Bee is given variously as caucasica or caucasia or: the latter is correct. www.nature.com/scientificreports/ In the twenty-first century CE and the third century L., the importance of classical "alpha" taxonomy 27 in finding, describing, and naming taxa remains critical. I provide here a phylogeographic re-evaluation of the evolution of Apis, based on a meta-analysis of the 13 coding regions (11,006 bp) from available mitogenomes in nine species of the genus Apis, including 78 individuals from 22 subspecies of the type species A. mellifera L. This meta-analysis indicates that Things Fall Apart with Out of Africa, Asia, and Middle East hypotheses, in favor of a European origin.

Materials and methods
Sequence data. Complete mitochondrial DNA genomes of taxa within the genus Apis were identified in the GenBank Taxonomic library by selection of "Genome"-flagged sequences within the display of species and subspecies. This was supplemented by a search of the "Nucleotide" library with the search term "Apis mitochondrion" through the end of October 2022. Table 1 lists GenBank accession numbers for sequences from nine species of Apis and 22 subspecies of A. m. mellifera as used here, together with the subspecies' geographic provenance.
The mitogenomic sequence of a curated Norwegian specimen of A. m. mellifera (KY926884) was used as the alignment reference. Alignment was done by eye with the help the MEGA X program 28 . The Apis mitochondrial coding genome comprises 13 genes over 11,043 bps ( Table 2). The GenBank annotations of various authors delimit coding regions with slightly different 5′ and 3′ endpoints, especially among species: spaces were inserted between coding regions so as to preserve open reading frames. The light-strand coding regions of the ND5, ND4, ND4L, and ND1 genes were included in their sense-strand equivalent 5′ → 3′ coding directions.
The 3′ region of the ND4 heavy-strand equivalent is difficult to align across subspecies, and a 37 bp region was excluded from all analyses. Ambiguous base calls (mostly W for A/T, flanked by AT bases) in several sequences have been tacitly resolved, such that these positions are invariant over all sequences. Several autapomorphic insertions of single triplets among species and subspecies have been tacitly removed. The coding region mitogenome sequence for a Bumble Bee, Bombus ignitus (GenBank NC010967), was used as the outgroup for interspecies comparisons. Honey bees and Bumble Bees occur in the tribes Apini and Bombini, respectively, of the subfamily Apinae. The alignment of coding regions between Bombus and Apis spp. is in some areas speculative: experimental inclusion or exclusion of ambiguous regions does not affect the inferred branching order of Apis species, and does not materially affect its statistical support. The consensus alignment for analysis comprises 11,006 bases within A. mellifera subspecies, and 11,070 bases across Apis and Bombus. Phylogenomic analysis. I performed three forms of phylogenetic analyses with MEGA X 28 . Maximum Parsimony (MP) analyses was performed with all nucleotide positions equally weighted, and SPR search. Maximum Parsimony analysis is foregrounded to identify inter-subspecific SNPs and to provide intra-specific patristic differences within Apis, and intra-versus inter-subspecific differences within A. mellifera. Maximum Likelihood analyses (ML) was performed with the general time reversible (GTR) model allowing for invariant sites and nearest-neighbor interchange (NNI) search. For molecular clock calculations, I used a fixed rate of 0.0115 substitutions/site/Myr on node distances calculated in a linearized Maximum Likelihood tree RelTime-ML model in MEGA (see "Discussion"). Neighbor Joining (NJ) analysis was performed on counts of differences, www.nature.com/scientificreports/ with the maximum composite likelihood model, and SPR search. Statistical confidence in all three methods was estimated from 3000 bootstrap replications each, under the same conditions as the main search. Concern has been expressed 12 as to the effect of inclusion of third triplet-position bases on phylogenetic inference. I examined the effect of exclusion of these data (model P12 and modifications thereof 12 ) in all three analyses. Initial assessment of the available mitogenomes for A. mellifera identified 16 Arabian bees in four monophyletic clades assigned to A. m. mellifera GenBank accessions MT745901-MT745915 14 that comprise sets of 5, 4, 4, and 3 identical sequences each: one sequence was included from each set, and the rest tacitly. A set of 11 mitogenomes from Kenyan bees (KJ396181-KJ396191) 15 assigned in GenBank to A. m. mellifera are used here with their correct subspecies identifications as provided by Z. Fuller (pers. comm.). A series of 20 sequences attributed to A. m. capensis (MG552683-692) or A. m. scutellata (MG552693-702) was retained in full. GenBank accessions KY926882 and KY926883 are attributed to A. m. syriaca and A. m. intermissa, respectively. Inspection of both sequences indicates numerous anomalies in coding region that produce branch attraction between the two and large phylogenetic separations from other GenBank accessions assigned to the same subspecies (Supplementary Fig. S1). Both are excluded from primary analyses here.
All figures were drawn with Corel PaintShop Pro 2023 (version 25). Figure 1a shows the MP analysis for nine species of Apis (Apini) with Bombus ignitus (Bombini) as outgroup.

Results
Out of seven interspecies nodes, six are supported by > 94% of bootstraps. Analyses with the NJ and ML methods give identical branching orders and substantially similar bootstrap support. Within Apis, the Dwarf Honey Bee pair [A. florea + A. andreniformis] is the outgroup to the remaining taxa. Within the Cavity-Nesting species, A. mellifera is the sister species to the others, and its phenetic difference from other cavity-nesting species is greater than that between the Dwarf and Giant Honey Bee pairs. Figure 1b shows the MP analysis of Apis species alone including ten subspecies of Apis mellifera: A. m. mellifera is sister to the remaining subspecies. Note that A. florea and A. mellifera, the only species whose native ranges overlap in the Middle East, are in separate morpho-behavioral and molecular clades. Figure 2 shows the schematic MP analysis of 66 (+ 12 tacit) sequences from 22 subspecies of A. mellifera, with multiple sequences included as described in Methods. The tree is rooted with A. m. mellifera as obtained from the analysis shown in Figs www.nature.com/scientificreports/ With a fixed clock rate of 0.0115 subs/site/Myr (see "Discussion"), all node distances in linearized ML trees can be converted directly to their age of divergence in a molecular clock (Fig. 4). For example, given a calculated distance to the basal node in Fig. 4 of 0.008234 subs/site (Relative Time), the node is dated at (0.008234 subs/ site)/(0.0000115 subs/site/Kyr) = 716 Ka. For Bombus and other Apis species included as in Fig. 1, the molecular clock indicates a late Miocene radiation (6-11 Mya) of morpho-behavioral groups within Apis ( Supplementary  Fig. S4), and a European origin of A. mellifera 780 Kya in the late Pleistocene. The molecular clock for subspecies of A. mellifera is shown in Fig. 4 for a subset of 37 sequences included in Fig. 2a, with single representative sequences for A. m. scutellata and A. m. capensis.

Discussion
"I have just been thinking, and I have come to a very important decision. These are the wrong sort of bees."-Winnie the Pooh.
A mitogenomic phylogeography of species of Apis and subspecies of Apis mellifera. The mitogenomic analysis confirms with high confidence previous inferences as to morpho-behavioral relationships among nine species as Dwarf, Giant, and Cavity-Nesting Honey Bees (Figs. 1a,b). Within the Apini, rooting with  Mitogenomic sequences assigned to the taxon A. m. mellifera in GenBank occur in nine lineages collectively made paraphyletic by the placement of other subspecies. The phylogenetic analysis indicates that most of these apparent anomalies are artefacts of miss-assignment of individual sequences to the type subspecies. As noted, the Kenyan bee series when correctly referred to subspecies accords with the cladistic arrangement described here. The Arabian bee series referred to A. m. mellifera in GenBank falls phylogenetically into two distinct clades, one as part of the basal A. m. mellifera sensu lato and the other A. m. jemenitica sensu lato, along with Kenyan bees reassigned to the same subspecies. Another two Kenyan sequences assigned to A. m. mellifera are more closely related to A. m. simensis and A. m. unicolor, respectively, which suggests they are in fact members of those two subspecies. The latter extends the subspecies range beyond insular Malagasy.
A revised phylogeographic hypothesis for the evolution of A. mellifera. The received picture of the evolution of A. mellifera has been an "Out of Africa" model 9 , with as many as three Recent excursions, originally to Europe, more recently to Iberia, and in historic times to South America as so-called "Killer" or "Africanized" Bees, which resulted from accidental crossing of imported A. m. scutellata queen with local A. m. ligustica drones 7 . This has been challenged more recently by models of "Out of Asia" 10 or North African/Middle Eastern 12 origins. The alternative phylogeographic clade structure presented here indicates that Things Fall       Supplementary Fig. S2). The clock is calibrated from the mean linearized nucleotide subs/site distances to each node (Relative Time) at 0.0115 subs/site/Myr (see text for sample calculation). See Supplementary Fig. S4 for the clock of A. mellifera within Apis, with Bombus as the designated outgroup.  Fig. 1a), and fails to resolve the deep phylogenetic separation of A. cerana / A. mellifera (cf. Fig. 1b and Supplementary Fig. S4).

Previous inferences from whole mitogenomes. Previous meta-analyses by Boardman and Eimanifar
et al. 11,[16][17][18][19][20][21][22]25,26 and Tihelka et al. 12 are compared with the present analysis in Fig. 6. The base phylogram is a Maximum Parsimony analysis calculated as in Fig. 2, with the addition of two problematic sequences noted in the text, KY926882 and KY926883, attributed to A. m. syriaca and A. m. intermissa, respectively. KY926882 " TB " falls outside the jemenitica cluster that includes A. m. syriaca KP163643, which is not used in either the " T " or " B " sets. KY936883 " T " is more closely related to KY936882 than to A. m. intermissa KM458618, which occurs where predicted biogeographically. There are indications of faulty data in KY936882 and KY936883: inspection of the last quarter of the aligned sequences shows two unbroken runs of seven shared phylogenetically informative sites 31  ] root their species-inclusive trees at the midpoint, rather than by an outgroup, which as drawn implies that the Dwarf and Giant Honey Bee species are sister groups, rather than that the former are the outgroup to the remaining species (cf. Fig. 1b). Their alignments include rDNA along with Coding Region sequences, which may contribute to the relatively weaker bootstrap support for some nodes: among subspecies of A. mellifera, my trial rDNA alignments were problematic at best, and these regions were excluded from the analysis here.    Fig. S5) gives a clade structure is similar to Fig. 2 here or that from their P12 model (except as to rooting), with bootstrap support for African and Mediterranean lineages again < 50%. Bootstrap support for other clades in Fig. 2 is strong. As shown by the molecular clock in Fig. 4, pairwise differences among nominal subspecies within the Southeast European, Asia Minor, and Mediterranean clades are less than those between individuals within other subspecies, even where within-clade relationships are well-defined. Where these local subspecies were originally defined by perceived morphological differences, review of their alpha and beta taxonomy may be indicated, with a view towards synonymizing local forms in gamma taxonomy: names matter 27 .
Additional phylogeographic inferences from partial mitogenome sequences. Besides (Supplementary Fig. S6 , and the co-occurrence of the two forms on Sicily suggests a secondary colonization from the westerly islands after in initial occupation from the European mainland. Likewise, the continental association of A. m. adami and A. m. cypria from Crete and Cyprus, respectively, is consistent with a dispersal corridor from southeastern Europe through the eastern Mediterranean and into the Levant, and suggests an eastern limit of the re-colonization of the insular Mediterranean at Sicily. The phylogeography proposed here accommodates these data well: further insight may be expected from complete mitogenomes. Previous inferences from nucDNA genome sequences. The "Thrice Out of Africa" hypothesis 9 follows a phylogenetic scheme that retrieved the same four groups (ACMO) as the then current continental geographic scheme based on meristics and morphology 8 . Their molecular network transposes the alphabetical order to an MAOC backbone, rooted with respect to a composite outgroup within the (A)frican subspecies, which along with Mellifera (M) was separated from the European (Continental) and Asian (Oriental) subspecies. Subsequent analyses of various nucDNA markers and combinations of A. mellifera subspecies broadly identify the MAOC groups 9,10,23,24,32,33,35 , however placement of the root on various branches according to internal and (or) external criteria varies, so as to suggest alternative geographic origins of A. mellifera. These include alternative interpretations of the Thrice Out of Africa data 29 [root indeterminate between M + A and O + C], as well as Thrice Out of Asia 9 and North African/Middle Eastern origin models 10,32,33,35 , for example with A. m. jemenitica in the Y lineage as outgroup to other subspecies of A.
There are fundamental differences between the MAOC backbone and mtDNA genome phylogenies (Supplementary Fig. S7), both with respect to placement of the root and allocation of subspecies to clades. No MAOC model places A. m. mellifera as outgroup to other subspecies: the mtDNA data place A. m. scutellata distal rather than proximal to the root of A. mellifera evolution. Closely related subspecies and geographically contiguous subspecies within the Mediterranean mtDNA clade here, including A. m. iberiensis and A. m. intermissa 9,32 , and A. m. ruttneri 24  Contrasts between nucDNA, mtDNA, and even morphological phylogenies are not unknown 29 . Maternallyinherited mtDNA phylogenies have the virtue of tracing maternal evolutionary lineages 36 , and may thus be particularly reliable for phylogeographic inferences about the origins and spread of "queen"-dispersed eusocial insect. A phylogeographic origin Out of Europe inferred from prior affinity with central Asian cavity-nesting bees seems more parsimonious that disjunct Sub-Saharan or Asian origins. anomalies within and among subspecies are in several cases artefacts of mis-assignment to the type subspecies. Incorrect phylogenetic inferences may also proceed from use of non-representative or faulty sequences. 6. Morphology-and (or) nucDNA-based biogeographic models are not consistent with this multi-mitogenome per taxon model. Genetic differences between nominal southeast European, Asia Minor, and Mediterranean subspecies are typically much less than those within other subspecies for which multiple individual sequences are available. Re-consideration of the alpha and gamma taxonomy of these taxa is indicated. Names matter.

Data availability
All mtDNA genome sequences used were obtained from and are available through NCBI GenBank. The accession numbers of the 66 Apis mellifera ssp sequences analyzed in detail are given in Table 1 and Supplementary Figs. S1-S3; 12 additional accessions referred to in Fig. 2 are also given there, and two questionable sequences are given in Fig. 6. Accession numbers of the sequences of nine species of Apis examined and the Bombus ignitus outgroup are given in Fig. 1a.